Abstract

The probability distribution of stock price changes is studied by analyzing a 
database (the Trades and Quotes Database) documenting every trade for all 
stocks in three major US stock markets, for the two-year period Jan. 1994 to Dec. 
1995. A sample of 40 million data points is extracted, which is substantially 
larger than studied hitherto. We find an asymptotic power-law behavior for 
the cumulative distribution with an exponent $\alpha \approx 3$, well outside the Lévy 
regime ($0 < \alpha < 2$). 

PACS numbers: 89.90.+n 

The asymptotic behavior of the increment distribution of economic indices has long been 
a topic of avid interest. Conclusive empirical results are, however, difficult to obtain, 
since the asymptotic behavior can be obtained only by a proper sampling of the tails, which 
requires a huge quantity of data. Here, we analyze a database documenting each and every 
trade in the three major US stock markets, the New York Stock Exchange (NYSE), the 
American Stock Exchange (AMEX), and the National Association of Securities Dealers 
Automated Quotation (NASDAQ), for the entire 2-year period Jan. 1994 to Dec. 1995. 
We thereby extract a sample of approximately $4 \times 10^7$ data points, which is much larger 
than studied hitherto. 

We form 1000 time series $S_i(t)$, where $S_i$ is the market price of company $i$ (i.e. the share 
price multiplied by the number of outstanding shares), and $i = 1, \dots, 1000$ is the rank of the 
company according to its market price on Jan. 1, 1994. The basic quantity of our study is 
the relative price change, 

$$G_i(t) \equiv \ln S_i(t + \Delta t) - \ln S_i(t) \simeq \frac{S_i(t + \Delta t) - S_i(t)}{S_i(t)}, \tag{1}$$

where the time lag is $\Delta t = 5$ min. We normalize the increments, 

$$g_i(t) \equiv \frac{G_i(t) - \langle G_i(t) \rangle}{v_i}, \tag{2}$$

where the volatility $v_i \equiv \sqrt{\langle (G_i(t) - \langle G_i(t) \rangle)^2 \rangle}$ of company $i$ is measured by the standard 
deviation, and $\langle \dots \rangle$ is a time average. 
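
As a minimal sketch of Eqs. (1) and (2), the normalization can be written in a few lines of Python. It assumes that the prices of one company have already been sampled on a regular 5 min grid; the function name `normalized_increments` and the toy price list are illustrative and do not come from the paper.

```python
import math
from statistics import fmean, pstdev

def normalized_increments(prices, lag=1):
    """Return g(t) = (G(t) - <G>) / v for one regularly sampled price series.

    G(t) = ln S(t + lag) - ln S(t); <G> is the time average of G and
    v is its standard deviation over the whole series.
    """
    log_prices = [math.log(p) for p in prices]
    G = [later - earlier for earlier, later in zip(log_prices, log_prices[lag:])]
    mean_G = fmean(G)
    v = pstdev(G)  # population standard deviation, sqrt(<(G - <G>)^2>)
    if v == 0:
        raise ValueError("price series has no fluctuations")
    return [(x - mean_G) / v for x in G]

# Toy data (hypothetical prices, not taken from the database).
prices = [100.0, 101.0, 100.5, 102.0, 101.2, 103.0]
g = normalized_increments(prices)
```

By construction the resulting series has zero mean and unit standard deviation, so thresholds on `g` are expressed in units of standard deviations.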

We obtain about 20,000 normalized increments $g_i(t)$ per company per year, which gives 
about $4 \times 10^7$ increments for the 1000 largest companies in the time period studied. Figure 
1a shows the cumulative probability distribution, i.e. the probability for an increment larger 
than or equal to a threshold $g$, $P(g) \equiv P\{g_i(t) \ge g\}$, as a function of $g$. The data are well fit by 
a power law 

$$P(g) \sim g^{-\alpha} \tag{3}$$

with exponents $\alpha = 3.1 \pm 0.03$ (positive tail) and $\alpha = 2.84 \pm 0.12$ (negative tail) from two 
to one hundred standard deviations. 
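
The fit of Eq. (3) can be illustrated with a rough sketch: build the empirical cumulative distribution and regress log P on log g over a chosen range. The data here are a synthetic Pareto sample with exponent 3, used only as a stand-in for the real increments. The helper names, the sample size, and the fit range are assumptions made for this example, and an unweighted least-squares fit is one simple choice rather than necessarily the procedure used for Fig. 1.

```python
import math
import random

def empirical_ccdf(values):
    """Return (g, P) pairs, P being the fraction of values >= g (no ties assumed)."""
    ordered = sorted(values, reverse=True)
    n = len(ordered)
    return [(g, (k + 1) / n) for k, g in enumerate(ordered)]

def fit_power_law(ccdf, g_min, g_max):
    """Least-squares slope of log P versus log g on [g_min, g_max]; returns alpha."""
    pts = [(math.log(g), math.log(p)) for g, p in ccdf if g_min <= g <= g_max]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return -sxy / sxx  # P ~ g^(-alpha), so alpha is minus the slope

rng = random.Random(0)
# Synthetic stand-in: Pareto variable with P(x) = x^-3 for x >= 1.
sample = [rng.paretovariate(3.0) for _ in range(200_000)]
alpha = fit_power_law(empirical_ccdf(sample), g_min=2.0, g_max=20.0)
```

For this synthetic sample the fitted `alpha` should come out close to 3. Because most points lie at the lower end of the range, an unweighted fit is dominated by them, which is one reason a local-slope analysis is a useful cross-check.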



In order to test this result, we calculate the inverse of the local logarithmic slope of $P(g)$, 
$\gamma^{-1}(g) \equiv -d \log P(g)/d \log g$. We estimate the asymptotic slope $\alpha$ by extrapolating $\gamma$ 
as a function of $1/g$ to 0. Figure 1b shows the results for the negative and positive tail 
respectively, each using all increments larger than 5 standard deviations. Extrapolation of 
the linear regression lines yields $\alpha = 2.84 \pm 0.12$ for the positive and $\alpha = 2.73 \pm 0.13$ for 
the negative tail. We test this method by analyzing two surrogate data sets with known 
asymptotic behavior, (a) an independent random variable with $P(x) = (1 + x)^{-3}$ and (b) an 
independent random variable with $P(x) = \exp(-x)$. The method yields the correct results, 
3 and $\infty$ respectively. 
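
The surrogate test can be reproduced in outline as follows. This sketch uses the rank-based estimator of the inverse local slope described in the caption of Fig. 1, averages it over blocks of consecutive ranks, and extrapolates a linear fit in $1/g$ to $1/g \to 0$. The function names, sample size, block size, and the threshold (given in raw units rather than standard deviations) are assumptions chosen for the illustration.

```python
import math
import random

def local_inverse_slopes(values, threshold):
    """Pairs (g_(k), gamma_k) with gamma_k = k * (ln g_(k) - ln g_(k+1)).

    Only values above `threshold` (which must be positive) are used, sorted
    in descending order so that k is the rank within the full sample.
    """
    tail = sorted((v for v in values if v > threshold), reverse=True)
    return [(tail[k - 1], k * (math.log(tail[k - 1]) - math.log(tail[k])))
            for k in range(1, len(tail))]

def extrapolated_gamma(values, threshold, block):
    """Intercept at 1/g -> 0 of a linear fit of block-averaged gamma versus 1/g."""
    pts = local_inverse_slopes(values, threshold)
    xs, ys = [], []
    for start in range(0, len(pts) - block + 1, block):
        chunk = pts[start:start + block]
        xs.append(sum(1.0 / g for g, _ in chunk) / block)
        ys.append(sum(gam for _, gam in chunk) / block)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx  # estimate of 1/alpha

rng = random.Random(1)
n_samples = 300_000
surrogate_a = [rng.paretovariate(3.0) - 1.0 for _ in range(n_samples)]  # P(x) = (1 + x)^-3
surrogate_b = [rng.expovariate(1.0) for _ in range(n_samples)]          # P(x) = exp(-x)

gamma_a = extrapolated_gamma(surrogate_a, threshold=1.0, block=1000)
gamma_b = extrapolated_gamma(surrogate_b, threshold=1.0, block=1000)
```

For surrogate (a) the exact inverse local slope is $(1 + x)/(3x) = 1/3 + 1/(3x)$, and for surrogate (b) it is $1/x$, so both are linear in $1/x$. The extrapolated intercepts `gamma_a` and `gamma_b` should therefore lie near $1/3$ and $0$, corresponding to $\alpha = 3$ and $\alpha = \infty$.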

To test the robustness of the inverse cubic law $\alpha \approx 3$, we perform several additional 
calculations: (i) we change the time increment in steps of 5 min up to 120 min, (ii) we 
analyze the S&P500 index over the 13-year period Jan. '84 to Dec. '96 using the same methods 
as above (Fig. 1c and d), and (iii) we replace the definition of the volatility by other measures, such 
as $v_i \equiv \langle |G_i(t) - \langle G_i(t) \rangle| \rangle$. The results are all consistent with $\alpha = 3$. These extensions will 
be discussed in detail elsewhere. 



To put these results in the context of previous work, we recall that proposals for $P(g)$ 
have included (i) a Gaussian distribution, (ii) a Lévy distribution, and (iii) a 
truncated Lévy distribution, where the tails become "approximately exponential". The 
inverse cubic law differs from all three proposals: unlike (i) and (iii), it has diverging higher 
moments (of order larger than 3), and unlike (i) and (ii) it is not a stable distribution. 
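
The statement about the moments follows directly from Eq. (3). As a short worked check, assume the power law holds exactly above some cutoff $g_0$ (a symbol introduced here only for illustration). The probability density and the tail contribution to the $k$-th moment are then

$$p(g) = -\frac{dP}{dg} \sim \alpha\, g^{-(\alpha + 1)}, \qquad \langle g^k \rangle_{\rm tail} \sim \alpha \int_{g_0}^{\infty} g^{\,k - \alpha - 1}\, dg ,$$

and the integral converges only for $k < \alpha$. With $\alpha \approx 3$ the variance ($k = 2$) is finite, while moments of order $k \ge \alpha$ diverge. A Lévy stable law with $0 < \alpha < 2$, by contrast, has infinite variance. Because the variance is finite here, sums of independent increments tend toward a Gaussian by the central limit theorem, so the distribution does not keep its shape under aggregation.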

FIG. 1. (a) Log-log plot of the cumulative probability distribution $P(g)$ of the normalized 
price increments $g_i(t)$. The lines in (a) are power law fits to the data over the 
range $2 < g < 100$. (b) The inverse local slope of $P(g)$, $\gamma(g) \equiv -(d \log P(g)/d \log g)^{-1}$, as a 
function of $1/g$ for the negative (o) and the positive (+) tail respectively. We obtain an estimator 
for $\gamma(g)$ by sorting the normalized increments by their size, $g^{(1)} > g^{(2)} > \dots > g^{(N)}$. 
The cumulative distribution can then be written as $P(g^{(k)}) = k/N$, and we obtain for the local 
slope $\gamma(g^{(k)}) = -\log(g^{(k+1)}/g^{(k)}) / \log(P(g^{(k+1)})/P(g^{(k)})) \simeq k\,[\log g^{(k)} - \log g^{(k+1)}]$. 
Each data point shown in (b) is an average over 1000 increments $g^{(k)}$, and the lines are 
linear regression fits to the data. Note that the average $m^{-1} \sum_{k=1}^{m} \gamma(g^{(k)})$ over all events 
used would be identical to the estimator for the asymptotic slope proposed by Hill. (c) 
Same as (a) for the 1 min S&P500 increments. The regression lines yield $\alpha = 2.93$ and 
$\alpha = 3.02$ for the positive and negative tails respectively. (d) Same as (b) for the 1 min 
S&P500 increments, except that the number of increments per data point is 100. 
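
The remark about the Hill estimator can be checked numerically. The sketch below compares the plain average of the rank-based local slopes over the $m$ largest events with the Hill estimator in its usual form, the mean of $\log(g^{(k)}/g^{(m+1)})$ over the same events. The function names, the synthetic Pareto sample, and the choice $m = 5000$ are assumptions made for this example.

```python
import math
import random

def mean_local_slope(sorted_desc, m):
    """Average of gamma_k = k * (ln g_(k) - ln g_(k+1)) over k = 1..m."""
    return sum(k * (math.log(sorted_desc[k - 1]) - math.log(sorted_desc[k]))
               for k in range(1, m + 1)) / m

def hill_inverse_exponent(sorted_desc, m):
    """Hill estimator of 1/alpha from the m largest values."""
    ref = math.log(sorted_desc[m])  # ln g_(m+1), the reference order statistic
    return sum(math.log(sorted_desc[k]) - ref for k in range(m)) / m

rng = random.Random(2)
# Synthetic stand-in: Pareto variable with P(x) = x^-3 for x >= 1.
data = sorted((rng.paretovariate(3.0) for _ in range(100_000)), reverse=True)

m = 5000
gamma_mean = mean_local_slope(data, m)
gamma_hill = hill_inverse_exponent(data, m)
```

The two numbers agree up to floating-point rounding, since summing $k\,[\log g^{(k)} - \log g^{(k+1)}]$ by parts gives $\sum_{k=1}^{m} \log g^{(k)} - m \log g^{(m+1)}$. For this sample, `1 / gamma_hill` should be close to 3.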